Overview
The DynamoEnhance Alignment SDK is designed to improve the safety of language models and to establish guardrails around them, so that their outputs follow specified policies and stay free of harmful content. Its alignment methods generate policy-relevant prompts and improve response quality through critique and revision. Helper methods support data generation, similarity detection, and format conversion.
Alignment Methods
generate_prompts_relevant_to_policy()
This method generates a list of prompts that are suitable for testing or training language models based on a specified alignment policy. It employs both diverse and in-domain algorithmic processes to ensure a balanced and comprehensive set of prompts. Key parameters include the alignment policy to follow, the domain of the language model, example prompts, the number of prompts to generate, and the ratio of diverse to in-domain prompts.
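To illustrate how the number of prompts and the diverse-to-in-domain ratio might interact, here is a minimal sketch. The function name split_prompt_budget and the parameter diverse_fraction are hypothetical and not part of the SDK. The sketch assumes the ratio is expressed as the fraction of the total that should be diverse; the SDK may interpret it differently.

```python
def split_prompt_budget(num_prompts: int, diverse_fraction: float) -> tuple[int, int]:
    """Split a total prompt budget into (diverse, in_domain) counts.

    Hypothetical helper: assumes the ratio is given as the fraction of
    all prompts that should come from the diverse process.
    """
    if num_prompts < 0:
        raise ValueError("num_prompts must be non-negative")
    if not 0.0 <= diverse_fraction <= 1.0:
        raise ValueError("diverse_fraction must be between 0 and 1")
    n_diverse = round(num_prompts * diverse_fraction)
    # Whatever is left over goes to the in-domain process,
    # so the two counts always sum to num_prompts.
    return n_diverse, num_prompts - n_diverse
```

Deriving the in-domain count by subtraction, rather than rounding both parts separately, guarantees that the total is preserved.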
formulate_critic_revision_prompts()
This function returns a tuple consisting of a critic prompt and a revision prompt. The critic prompt is designed to critique a response for any potentially offensive or harmful content, while the revision prompt is used to revise the response to remove such content and address any problematic assumptions.
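The shape of such a return value can be sketched as follows. The function name build_critic_revision_prompts, its response argument, and the prompt wording are all assumptions made for illustration; the SDK's actual signature and prompt text are not shown in this document.

```python
def build_critic_revision_prompts(response: str) -> tuple[str, str]:
    """Return a (critic_prompt, revision_prompt) pair for one response.

    Hypothetical sketch: the wording below is illustrative only.
    """
    critic_prompt = (
        "Identify any offensive or harmful content in the response below, "
        "and note any problematic assumptions it makes.\n\n"
        f"Response:\n{response}"
    )
    revision_prompt = (
        "Rewrite the response below to remove the offensive or harmful "
        "content and to address the problematic assumptions raised in "
        "the critique.\n\n"
        f"Response:\n{response}"
    )
    return critic_prompt, revision_prompt
```

A caller would typically unpack the tuple, send the critic prompt to a model first, and then send the revision prompt together with the critique it received.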
write_better_responses_critique()
This method improves base responses for compliance with a given policy. It involves critiquing the base responses, generating improved versions based on the critiques, and filtering out already satisfactory responses. The output is a list of dictionaries containing the prompts, rejected responses, chosen responses, and critiques.
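The critique, revise, and filter flow described above could be sketched roughly like this. Everything here is an assumption for illustration: the function name build_preference_records, the three injected callables (which stand in for LLM calls), the dictionary key names, and the choice to decide "already satisfactory" from the critique text.

```python
from typing import Callable, Iterable


def build_preference_records(
    pairs: Iterable[tuple[str, str]],
    critique: Callable[[str, str], str],
    revise: Callable[[str, str, str], str],
    is_satisfactory: Callable[[str], bool],
) -> list[dict]:
    """Turn (prompt, base_response) pairs into preference records.

    Hypothetical sketch: the callables stand in for model calls.
    """
    records = []
    for prompt, base_response in pairs:
        critique_text = critique(prompt, base_response)
        # Skip responses the critique judges to be compliant already.
        if is_satisfactory(critique_text):
            continue
        improved = revise(prompt, base_response, critique_text)
        records.append(
            {
                "prompt": prompt,
                "rejected": base_response,  # original, non-compliant response
                "chosen": improved,         # revised response
                "critique": critique_text,
            }
        )
    return records
```

Passing the model calls in as callables keeps the sketch runnable with simple stubs and independent of any particular API.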
Helper Methods
This method generates JSON output from LLM APIs based on provided prompts. It supports various models and APIs, allowing users to specify parameters such as the temperature for generation, the model to call, the API endpoint, and additional constraints to enforce on the output format.
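One common way to enforce a JSON output format is to parse the model's reply, check it against the constraints, and retry on failure. The sketch below shows that pattern only; generate_json, call_model, required_keys, and max_attempts are hypothetical names, and the model call is injected as a plain callable rather than tied to any real API.

```python
import json
from typing import Callable, Sequence


def generate_json(
    call_model: Callable[[str], str],
    prompt: str,
    required_keys: Sequence[str] = (),
    max_attempts: int = 3,
) -> dict:
    """Call a model until it returns a JSON object with the required keys.

    Hypothetical sketch: call_model stands in for an LLM API call.
    """
    last_error = None
    for _ in range(max_attempts):
        raw = call_model(prompt)
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError as exc:
            last_error = exc
            continue
        if not isinstance(parsed, dict):
            last_error = ValueError("expected a JSON object")
            continue
        missing = [key for key in required_keys if key not in parsed]
        if missing:
            last_error = ValueError(f"missing keys: {missing}")
            continue
        return parsed
    raise ValueError(f"no valid JSON after {max_attempts} attempts") from last_error
```

Parameters such as temperature or the endpoint would live inside call_model in this design, which keeps the validation logic separate from the transport.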
This function identifies and returns indices of strings within a list that have a cosine similarity above a specified threshold. It uses an LLM to generate embeddings and calculate similarities, making it useful for detecting similar text segments within large datasets.
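The thresholding step can be sketched in plain Python. This example takes precomputed embedding vectors instead of calling an LLM, and it assumes a string is flagged when its cosine similarity to any earlier string exceeds the threshold; the names cosine_similarity and find_similar_indices are hypothetical, and the SDK may define "similar" differently.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_similar_indices(
    embeddings: Sequence[Sequence[float]], threshold: float
) -> list[int]:
    """Return indices whose vector is too similar to any earlier vector."""
    flagged = []
    for i in range(1, len(embeddings)):
        if any(
            cosine_similarity(embeddings[i], embeddings[j]) > threshold
            for j in range(i)
        ):
            flagged.append(i)
    return flagged
```

Comparing each item only against earlier ones keeps the first occurrence of every near-duplicate group, which is convenient for deduplication. The pairwise loop is quadratic in the number of strings, so large datasets would likely need a vectorized or approximate approach.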
This utility converts JSONL files to CSV format, facilitating easier data manipulation and analysis. It reads the contents of a JSONL file and writes them into a specified CSV file.
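A conversion of this kind can be written with the Python standard library alone. The sketch below is not the SDK's implementation; the function name jsonl_to_csv and its parameters are hypothetical, and it assumes each line of the input is a flat JSON object.

```python
import csv
import json


def jsonl_to_csv(jsonl_path: str, csv_path: str) -> None:
    """Write the JSON objects in a JSONL file as rows of a CSV file."""
    rows = []
    fieldnames = []
    with open(jsonl_path, encoding="utf-8") as src:
        for line in src:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            row = json.loads(line)
            rows.append(row)
            # Collect column names in first-seen order across all rows.
            for key in row:
                if key not in fieldnames:
                    fieldnames.append(key)
    with open(csv_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)  # keys missing from a row become empty cells
```

Building the header from the union of all keys means records with differing fields still convert cleanly, at the cost of reading the whole file into memory first.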
Conversely, this method converts CSV files to JSONL format. It reads the contents of a CSV file and writes them into a specified JSONL file, supporting data transformations for various applications.
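A CSV-to-JSONL conversion can be sketched with the standard library as follows. The name csv_to_jsonl and its parameters are hypothetical, not the SDK's actual signature, and the sketch assumes the first CSV row is a header.

```python
import csv
import json


def csv_to_jsonl(csv_path: str, jsonl_path: str) -> None:
    """Write each CSV row as one JSON object per line."""
    with open(csv_path, newline="", encoding="utf-8") as src, open(
        jsonl_path, "w", encoding="utf-8"
    ) as dst:
        # DictReader uses the header row as the keys of each record.
        for row in csv.DictReader(src):
            dst.write(json.dumps(row, ensure_ascii=False) + "\n")
```

Because CSV carries no type information, every value in the resulting JSONL is a string; numeric or boolean fields would need an explicit conversion step.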